The main purpose of the research reported in this document was to discover 
whether controlled experiments can be conducted on the relations between people 
and the complex computing systems which they use. Three increasingly complex 
experiments were designed to test the effect of varying delays of computer 
response on the number of commands issued per minute, as well as the total time 
needed to complete a task. The system used was a time-shared, on-line TX-2 
computer and the Lincoln Reckoner, a subset of the programs in the executive system 
known as APEX. It was hoped that the experiments would not only further the 
knowledge of how people solve problems, but also aid in the design of new systems. 
The results indicate not only the feasibility of testing man-computer interaction, but 
also demonstrate more clearly the differences between the subjects' behavior in the 
various tasks in such indices of performance as net completion time and number of 
outputs. In addition, the gross completion time curves and output rate curves indicate 
that experiments large enough to produce stable curves would be feasible. 
Appendices, a reference list, diagrams, charts, tables and graphs are included in the 
document. (SH) 
Three experiments explored the way in which delay in the response of the 
system affects the user's performance in solving problems with an on-line 
computing service. Each experiment was more ambitious than the preceding: the 
subject's task was more realistic and more complex. In each experiment there 
were four subjects under delay conditions of about 1 sec. to 100 sec. The on-line 
computing service was the Lincoln Reckoner. 

As expected, the average time the user required to complete a task increased 
as the response-delay increased, and the rate at which he demanded service 
declined as the delay increased. The relation of net completion time (time to complete 
the task, minus the time during which the user was waiting for a response) to 
response delay depended on the type of task. In the more realistic experiments, 
the net completion time increased with delay (suggesting that long delays are 
distracting). The number of outputs (i.e., displays or type-outs) per task was 
also considered. 

The main conclusion is that controlled experiments of this kind are feasible 
and can be used as the basis for design of on-line computing services. 

Accepted for the Air Force 

Franklin C. Hudson 

Chief, Lincoln Laboratory Office 
INTRODUCTION 



This note presents three experiments on the effects of response-delays in a 
time-shared, on-line computing system. The main purpose was to discover whether 
controlled experiments can be conducted on the relations between people and the 
complex computing systems with which they work. We hoped to find out whether 
such human factors experiments, necessarily based on the complexity of interactions 
between men and systems, are feasible. If they are, the results should be useful 
both in the design of computing systems and in achieving a better understanding of 
how people solve problems. 

It is important to note that these experiments investigated the use of the computer 
as a computing device rather than as a programming device. It is not at all necessary 
to know how to "program," in the sense of compiler language or machine language, 
in order to solve substantive problems with a "problem-oriented" system of the kind 
used in these experiments. Human factors experiments on the performance of program- 
mers have been done (1,2,3), and they suggest that further experiments of that kind 
would be useful. The previous experiments, however, have been primarily concerned 
with the differences between on-line and "batch" systems. The experiments reported 
in this note are, in contrast, concerned with substantive users instead of programmers, 
and they are attempts to make a parametric study of system effects. That is, we are 
concerned with establishing functional relations rather than just looking for significant 
differences between conditions. 
There have traditionally been two points of view on the amount of delay 
acceptable in an on-line computer: one, that delay should be imperceptible, and that 
anything less ambitious is unacceptable in a really useful facility; the other, that 
the user is relatively unimportant compared with the machine, and that the time- 
sharing algorithm should therefore be designed to attain efficient machine 
performance, letting the delays fall where they may. Neither of these views really 
faces the question of designing a system for people to use in an organization that 
is concerned with costs. To be realistic, the designer must consider the trade- 
off between the cost of the computer and the time of the people who use it. One of 
the purposes of the present note is to show how empirical curves can be obtained 
to give the designer the data on user performance that he needs if he wants to design 
for minimum total cost. 

We conducted three experiments, all on the effects of response-delay, differing 
primarily in the type of task the subjects performed. Each experiment was designed 
to have greater face validity than the preceding one, and thus introduced greater 
complexity into the subjects' task. One index of this complexity was the successively 
larger repertoire of computational tools needed to deal with the three kinds of tasks. 
In the first experiment the subjects needed only one routine, in the second they had 
about a dozen available, and in the third they could use almost the complete Lincoln 
Reckoner, a library of about 80 routines. 
GENERAL PROCEDURE 



Computation Facilities Used 

The three experiments were performed on the TX-2 computer, an experimental 
machine at the Lincoln Laboratory. The computer operated under a time-sharing 
executive system known as APEX (4). The Lincoln Reckoner (5), a subset of the 
programs in the APEX public library, was the facility the subjects used to work on 
the tasks. 

The Lincoln Reckoner has been described as "a time-shared system for on-line 
use in scientific and engineering research . . . it was designed . . . to find out what 
features of such a service will have an important effect on the amount of work the 
user gets done . . . it offers a library of routines that concentrate on one particular 
application, numerical computations on arrays of data. It is intended for use in 
feeling one's way through the reduction of data from a laboratory experiment, or in 
trying out theoretical computations of moderate size." (5, p. 433) 

Three basic features of a user-oriented system of this type are (5, pp. 437-8): 

1. Automatic application of routines. Almost all of the clerical 
work needed to perform an operation - i.e., to apply a public routine - 
is done automatically. The system takes care of the location of the 
data, the dimensions of arrays and so forth. Ideally, all the user has 
to do is somehow indicate the operation he wants to apply, the data to 
which he wants it applied, and - perhaps - the way in which he will 
identify the results when he wants to use them again. 

2. Automatic retention of results in such a form that they can 
be used as operands for other routines. The results of operations 
are stored in such a fashion that they can be used later as inputs 
to other operations - including operations that the user did not 
have in mind when the results were obtained. He need only specify 
the name by which he wants to identify a result; the system remem- 
bers where the result has been stored and automatically records 
the descriptive information that will be needed if the result is to 
be used later as an operand - e.g., the dimensions are recorded 
if the result is an array of numbers. 

3. Facilities for concatenation of routines. The user can define 
a sequence of operations and then use the sequence as he would 
use one of the primitive routines in the library. The new operation 
can be used as part of another sequence, and so on. 

Subjects had direct access to the computer by way of a terminal consisting of a 
Lincoln Writer keyboard and printer, and a CRT display. Across the room there 
was a Xerox high-speed printer that provided on-line hard-copy. Textual information 
and graphs were presented on the CRT or the Xerox at the user's request. The 
graph-plotting automatically provided appropriate scales for axes, and would plot 
up to three sets of x - y values. 

The Independent Variable: Delay in Output 



The independent variable in all three experiments was the amount of delay in 
each output from the computer. "Output," as used here, means that the subject 
has requested some response, such as a type-out or a CRT display, or that an 
error message has been typed by the machine because he violated a syntactic or 
semantic rule of the system. It should be remembered throughout this paper that 
the Reckoner, unlike many on-line systems, does not reply to every line of typing. 
Commands are saved and executed when the system has time; an output is pro- 
duced only when the system gets to the execution of a command that requests an 
output, or a command that is in error. 

There were five experimental conditions: the nominal delay in each output was 
1, 3, 10, 30, or 100 sec. All five conditions were used in the first and second 
experiments, but only four conditions, all but the 3 sec. delay, were used in the 
third experiment. In the conditions in which the nominal delay was 3, 10, 30 or 
100 sec., the actual delay of any one output varied within plus or minus 10% of the 
nominal value. In the condition in which the nominal delay was 1 sec., the machine 
was actually responding as quickly as possible; that is, the actual delay was simply 
the time required to do the computation and prepare the output, plus, occasionally, 
some extra delay because the machine was being time-shared among the four 
subjects. 

Outputs were trapped to a program that decided the extent of the delay. The 
program first selected a delay at random from a table that was produced each time 
the subject started a new task. The program then compared the time of the previous 
output with the time of the carriage return that led to the current output (each com- 
mand is ended by a carriage return): the more recent of those two events was 
regarded as the start of the delay interval. If the delay interval was already greater 
than the selected delay by the time the trapping program gained control, the output 
was begun immediately. If, however, the selected delay was greater, the program 
waited until the delay interval became equal to the selected delay. The time of each 
carriage return, the selected delay, and the actual delay were recorded by the 
computer. 
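
The trapping logic described above can be sketched as follows. This is an illustration only, not the TX-2 program: the function names, the table size, and the uniform distribution of delays within plus or minus 10% of the nominal value are assumptions, since the text states only the range of variation and the rule for timing the output.

```python
import random


def make_delay_table(nominal, size=50, spread=0.10, rng=None):
    """Build the per-task table of candidate delays (in seconds).

    Assumption: delays are uniform within +/- `spread` of the nominal
    value; the source gives only the +/- 10% range, not the distribution.
    """
    rng = rng or random.Random()
    low, high = nominal * (1 - spread), nominal * (1 + spread)
    return [rng.uniform(low, high) for _ in range(size)]


def output_start_time(now, prev_output_time, carriage_return_time, selected_delay):
    """Return the time at which a trapped output should begin.

    The delay interval starts at the more recent of the previous output
    and the carriage return that led to the current output.
    """
    interval_start = max(prev_output_time, carriage_return_time)
    elapsed = now - interval_start
    if elapsed >= selected_delay:
        # The interval already exceeds the selected delay: begin at once.
        return now
    # Otherwise wait until the interval equals the selected delay.
    return interval_start + selected_delay


if __name__ == "__main__":
    table = make_delay_table(nominal=10.0, rng=random.Random(0))
    delay = random.Random(1).choice(table)
    print(output_start_time(now=12.0, prev_output_time=5.0,
                            carriage_return_time=10.0, selected_delay=delay))
```

Because the interval is measured from the later of the two events, time the subject spends typing after the previous output counts toward the delay only when the carriage return precedes that output.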

The Dependent Variables 

The designer of an on-line computing service presumably would be interested 
in balancing the cost of the user's time against the cost of the computer system, 
and thus would want to know how the time that the user needs to complete his task 
depends on the delay in the machine's response. In each of the three experiments 
we shall therefore present graphs showing how the time required to complete a 
typical task varies as a function of delay. 

1. In the first experiment the mean delay in the nominal 1 sec. condition was 
approximately 0.7 sec., in the second it was approximately 0.4 sec., and in the 
third it was, we judge, somewhere in between. 

Since the delay in the machine's response may well affect the rate at which the 
user requests service, the designer would like to know how the number of commands 
issued per minute varies from one delay condition to another. In each experiment 
we shall present graphs of number of outputs per minute, although in the third 
experiment the use of output rate as a measure of the load the user puts on the system 
is tenuous. In that experiment, in contrast to the first two, the number of commands 
the subject gives between commands that produce outputs may vary considerably. 

2. It might be preferable to present the number of commands per unit time, but 
those data are not available yet. 

The measures we have just considered (completion time and output rate) are 
of interest to the system designer, but to understand the subject's behavior we need 
measures that reflect his behavior more directly. In each experiment we shall 
therefore present graphs of the number of outputs for a typical task and the net 
completion time for a typical task. The net completion time is defined as T - N x D, 
where T is the actual time the subject required to complete the task, N is the 
number of outputs he received, and D is the nominal delay in the condition under 
which he was working. Thus the net completion time is approximately the time 
required to complete the task, when the intervals during which the subject was 
waiting for an output are ignored. 
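
As a worked illustration of the two derived measures, the sketch below computes the net completion time T - N x D and an output rate. The function names and the sample numbers are hypothetical, not data from the experiments, and computing the rate over the gross completion time is an assumption.

```python
def net_completion_time(total_time, n_outputs, nominal_delay):
    """Net completion time T - N x D, all times in seconds."""
    return total_time - n_outputs * nominal_delay


def outputs_per_minute(n_outputs, total_time):
    """Output rate over the gross completion time (assumed definition)."""
    return n_outputs / (total_time / 60.0)


if __name__ == "__main__":
    # Hypothetical task: 400 sec. in all, 12 outputs, nominal delay 10 sec.
    print(net_completion_time(400.0, 12, 10.0))  # 280.0
    print(outputs_per_minute(12, 400.0))
```

Since D is the nominal rather than the actual delay, the result is only approximately the time the subject spent not waiting, as the text notes.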

Subjects 

The four subjects of these experiments were the four authors of this note. 
Subjects knew the delay condition that was in effect on each task and acted under 
a set to finish each task as soon as possible. Subjects were free to take breaks 
between tasks. They worked for approximately two hours, excluding breaks, one 
evening a week, and each experiment required several weekly sessions. 

Since the subjects were also the experimenters they spent a considerable 
amount of time exploring the three task types before experimentation formally 
began. For this reason the role of practice effects (very small) may be mis- 
leading in the data analyses of the experiments. 



EXPERIMENT I: Railroad Track Tasks 



Procedure 

The first experiment, called the Railroad Track Experiment (RR), may be 
regarded as a very simple kind of problem-solving. Since it was the first 
experiment, we tried to choose a task that would be simple, but would require a large 
amount of interaction with the computer. 

At the beginning of each RR problem the subject typed a message that showed 
he was ready to begin, and the machine replied by displaying on the CRT a 5" x 5" 
graph of a pair of parallel curves (like railroad tracks) separated by approximately 
i ", and a horizontal line across the middle of the CRT. An example is shown in 
Fig. 1. The subject’s task was to manipulate the horizontal line until it fell between 
the pair of curved lines. The tool with which he manipulated the line was a command 
that altered the line by adding to it a "bump," shaped like a Gaussian bell-curve, 
whose height, width, and horizontal location he specified. (Appendix A presents 
the details of the "bump" routine and how the subject specified its parameters.) 

Each successive command cumulated with the previous ones, and automatically 
displayed the altered line superimposed on the railroad tracks, which remained 
fixed. The old display remained until the altered display appeared. (The time 
until the altered display appeared was, of course, the independent variable. ) When 
the manipulated line fell completely between the railroad tracks the time and a 
message "DONE" were typed, indicating that the problem was finished. 
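
The task mechanics can be pictured with a minimal sketch. The actual "bump" routine is specified in Appendix A and is not reproduced here; the parameterization below (width taken as the standard deviation of the Gaussian, curves sampled at discrete x values) and all names are assumptions made for illustration.

```python
import math


def add_bump(line, xs, height, width, center):
    """Return a new line with a Gaussian bump added to the old one.

    Assumption: `width` acts as the standard deviation of the bell-curve;
    Appendix A of the report defines the real parameters.
    """
    return [
        y + height * math.exp(-0.5 * ((x - center) / width) ** 2)
        for x, y in zip(xs, line)
    ]


def is_between_tracks(line, lower, upper):
    """True when every sampled point lies strictly between the two tracks."""
    return all(lo < y < hi for y, lo, hi in zip(line, lower, upper))


if __name__ == "__main__":
    xs = [i / 10.0 for i in range(51)]            # 0.0 .. 5.0
    lower = [math.sin(x) for x in xs]             # hypothetical lower track
    upper = [math.sin(x) + 0.5 for x in xs]       # hypothetical upper track
    line = [0.0 for _ in xs]                      # initial horizontal line
    line = add_bump(line, xs, height=1.2, width=0.8, center=1.5)
    print("DONE" if is_between_tracks(line, lower, upper) else "not yet")
```

Each call builds on the line returned by the previous call, which mirrors the way successive commands cumulated in the experiment.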
Fig. 1. An example of the initial display in the RR task. 




Each RR problem was selected independently by a random process that specified 
the locus of the center of the parallel curves. (Appendix A presents further details.) 
Each subject did 25 tasks: the order of the five delay conditions (1, 3, 10, 30, and 
100 sec.) was randomized over the 25 problems, with the restriction that the 
subject did five tasks under each condition. 
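
One way to produce such a restricted random order is to list each condition five times and shuffle the list. The sketch below assumes this method; the report does not say how the randomization was actually carried out, and the function name is hypothetical.

```python
import random


def delay_schedule(conditions=(1, 3, 10, 30, 100), per_condition=5, rng=None):
    """Return a random order of delay conditions (sec.) for one subject.

    Each condition appears exactly `per_condition` times, so the
    restriction of five tasks per condition holds by construction.
    """
    rng = rng or random.Random()
    schedule = [d for d in conditions for _ in range(per_condition)]
    rng.shuffle(schedule)
    return schedule


if __name__ == "__main__":
    order = delay_schedule(rng=random.Random(0))
    print(len(order), order)
```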

Results 

The experiment will be analyzed as a 4 x 5 factorial design (4 subjects and 
5 conditions of delay) with five trials per cell — 100 trials in all. Completion time 
was the interval between the appearance of the display that stated the task, and the 
appearance of the "DONE" that marked its completion. In this experiment every 
command produced an output (an error message or new display); thus the number 
of outputs was simply the number of commands the subject gave, excluding the 
command that showed he was ready to begin, but including the command that pro- 
duced the message "DONE." 

Completion time and output rate. - The arithmetic means of the times 
required to complete the five tasks performed under each condition of delay are shown 
in Fig. 2 for each subject. The number of outputs per minute under each delay 

3. The means plotted in Fig. 2 are arithmetic, not geometric means. The geometric 
mean might be a more stable statistic, but it is not what the designer of a computer 
system needs. Given the arithmetic mean of the times required to complete the tasks 
in some population, he can make an accurate estimate of the total time that will be 
required to complete, say 1000 tasks drawn at random from that population: he 
simply multiplies the mean by 1000. But given the geometric mean, there is no good 
Fig. 2. Expenditure of the user's time: Arithmetic mean of time to 
complete a task in the RR experiment. (Log scales on both axes.) 



